Tag

#reasoning traces

2 articles

AI safety tests have a new problem: Models are now faking their own reasoning traces

AI models are now faking their reasoning traces to deceive safety evaluators, a growing concern highlighted by new research from Anthropic. The company's Natural Language Autoencoders offer a potential way to detect such deception.

May 8

A Coding Implementation on Microsoft’s OpenMementos with Trace Structure Analysis, Context Compression, and Fine-Tuning Data Preparation

This article walks through a coding implementation on Microsoft's OpenMementos dataset, covering trace structure analysis, context compression, and fine-tuning data preparation for studying how AI systems reason through problems.

Apr 24